Abstract. We show that if DTIME[$2^{O(n)}$] is not included in DSPACE[$2^{o(n)}$]
then, for every set B in PSPACE, all strings x in B of length n
can be represented by a string compressed(x) of length at most
$\log(|B^{=n}|) + O(\log n)$, such that a polynomial-time algorithm, given
compressed(x), can distinguish x from all the other strings in $B^{=n}$.
Modulo the $O(\log n)$ additive term, this achieves the information-theoretical
optimum for string compression.

1 Introduction

In many practical and theoretical applications in computer science, it is
important to represent information in a compressed way. If an application handles
strings x from a finite set B, it is desirable to represent every x by another
shorter string compressed(x) such that compressed(x) describes unambiguously
the initial x. Regarding the compression rate, ideally, one would like to achieve
the information-theoretical bound $|compressed(x)| \le \log(|B|)$, for all $x \in B$. If
a set B is computably enumerable, a fundamental result in Kolmogorov
complexity states that for all $x \in B^{=n}$, $C(x) \le \log(|B^{=n}|) + O(\log n)$, where C(x) is
the Kolmogorov complexity of x, i.e., the length of the shortest effective
description of x ($B^{=n}$ is the set of strings of length n in B). The result holds
because x can be described by its rank in the enumeration of $B^{=n}$. However,
enumeration is typically a slow operation and, in many applications, it is
desirable that the unambiguous description is not merely effective, but also
efficient. This leads to the idea of considering a time-bounded version of
Kolmogorov complexity. An interesting line of research
[Sip83,BFL01,BLvM05,LR05], which we also pursue in this paper, focuses on the
time-bounded distinguishing Kolmogorov complexity, $CD^{t}(\cdot)$.
We say that a program p distinguishes x if p accepts x and only x. $CD^{t,A}(x)$
is the size of the smallest program that distinguishes x and that runs in time
t(|x|) with access to the oracle A. Buhrman, Fortnow, and Laplante [BFL01]
show that for some polynomial p, for every set B, and every string $x \in B^{=n}$,
$CD^{p,B^{=n}}(x) \le 2\log(|B^{=n}|) + O(\log n)$. This is an important and very general
result, but the upper bound for the compressed string length is roughly
$2\log(|B^{=n}|)$ instead of the $\log(|B^{=n}|)$ that one may hope for. In fact,
Buhrman, Laplante, and Miltersen [BLM00] show that for some sets B, the factor 2
is necessary. There are some results where the upper bound is asymptotically
$\log(|B^{=n}|)$ at the price of weakening other parameters. Sipser [Sip83] shows
that the upper bound of $\log(|B^{=n}|)$ can be achieved if we allow the
distinguisher program to use polynomial advice: For every set B, there is a
string $w_B$ of length poly(n) such that for every $x \in B^{=n}$,
$CD^{poly,B^{=n}}(x \mid w_B) \le \log(|B^{=n}|) + \log\log(|B^{=n}|) + O(1)$.
Buhrman, Fortnow, and Laplante [BFL01] show that $\log(|B^{=n}|)$ can be achieved
if we allow a few exceptions: For any B, any $\epsilon$, for all except a fraction
of $\epsilon$ strings $x \in B^{=n}$,
$CD^{poly,B^{=n}}(x) \le \log(|B^{=n}|) + \mathrm{polylog}(n \cdot 1/\epsilon)$.
Buhrman, Lee, and van Melkebeek [BLvM05] show that for all B and $x \in B^{=n}$,
$CND^{poly,B^{=n}}(x) \le \log(|B^{=n}|) + O((\sqrt{\log(|B^{=n}|)} + \log n)\log n)$,
where CND is similar to CD except that the distinguisher program is
nondeterministic.
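
The rank-based description mentioned at the start of this section can be
illustrated with a minimal Python sketch. The function names and the toy set B
are assumptions made here for illustration; they do not come from the paper.
The sketch shows why the description has about $\log(|B^{=n}|)$ bits and why
decoding requires enumerating the whole set.

```python
from math import ceil, log2

def compress_by_rank(x: str, B: list[str]) -> str:
    """Describe x by its rank in the sorted enumeration of B (strings of one length)."""
    ordered = sorted(set(B))
    rank = ordered.index(x)                     # raises ValueError if x is not in B
    width = max(1, ceil(log2(len(ordered))))    # roughly log|B| bits
    return format(rank, f"0{width}b")

def decompress_by_rank(code: str, B: list[str]) -> str:
    """Recover x from its rank; this enumerates all of B, which is the slow part."""
    return sorted(set(B))[int(code, 2)]

# Hypothetical toy set of strings of length n = 4.
B = ["0011", "0101", "0110", "1001", "1010", "1100"]
code = compress_by_rank("1001", B)
print(code, decompress_by_rank(code, B))
```

The code length depends only on the size of the set, not on n, which is the
information-theoretic bound the paper aims to match with an efficient
distinguisher.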

Our main result shows that under a certain reasonable hardness assumption,
the upper bound of $\log(|B^{=n}|)$ holds for every set B in PSPACE.

Main Result. Assume that there exists $f \in E$ that cannot be computed by
circuits of size $2^{o(n)}$ with PSPACE gates. Then for any B in PSPACE, there
exists a polynomial p such that for every $x \in B^{=n}$,

$CD^{p,B^{=n}}(x) \le \log(|B^{=n}|) + O(\log n)$.

The main result is a corollary of the following stronger result: Under the same
hardness assumption, the distinguisher program p for x of length
$\log(|B^{=n}|) + O(\log n)$ is simple conditioned by x, in the sense that
$C^{poly}(p \mid x) = O(\log n)$, where $C^{poly}(\cdot)$ is the polynomial-time
bounded Kolmogorov complexity.

We also consider some variations of the main result in which the set B is in
P or in NP. We show that the hardness assumption can be somewhat weakened
by replacing the PSPACE gates with $\Sigma_3^p$ gates. We also show that the
distinguisher program no longer needs oracle access to $B^{=n}$ in case we allow
it to be nondeterministic and B is in NP.

The hardness assumption in the main result, which we call H1, states that
there exists a function $f \in E = \bigcup_{c > 0} DTIME[2^{cn}]$ that cannot be computed by
circuits of size $2^{o(n)}$ with PSPACE gates. This looks like a technical hypothesis;
however, Miltersen [Mil01] shows that the more intuitive assumption "E is not
contained in DSPACE[$2^{o(n)}$]" implies H1. We note that this assumption (or
related versions) has been used before in somewhat similar contexts. Antunes
and Fortnow [AF09] use a version of H1 (with the PSPACE gates replaced by
$\Sigma_2^p$ gates) to show that the semi-measure $m^p(x) = 2^{-C^p(x)}$ dominates all
polynomial-time samplable distributions. Trevisan and Vadhan [TV00] use a
version of H1 (with the PSPACE gates replaced by $\Sigma_5^p$ gates) to build for
each k a polynomial-time extractor for all distributions with min-entropy
$(1 - \delta)n$ that are samplable by circuits of size $n^k$.



1.1 Discussion of technical aspects 



We present the main ideas in the proof of the main result. The method is
reminiscent of techniques used in the construction of Kolmogorov extractors
in [FHP+06,HPV09,Zim10]. Let B be in PSPACE. To simplify the argument,
suppose that $|B^{=n}|$ is a power of two, say $|B^{=n}| = 2^k$. If we had a
polynomial-time computable function $E : \{0,1\}^n \to \{0,1\}^k$ whose restriction
to B is 1-to-1, then every $x \in B^{=n}$ could be distinguished from the other
elements of $B^{=n}$ by z = E(x) and we would obtain
$CD^{poly,B^{=n}}(x) \le |z| + O(1) = \log(|B^{=n}|) + O(1)$.
We do not know how to obtain such a function E, but, fortunately, we can afford
a slack additive term of $O(\log n)$ and therefore we can relax the requirements for
E. We can consider functions E of type $E : \{0,1\}^n \times \{0,1\}^{\log n} \to \{0,1\}^k$. More
importantly, it is enough if E is computable in polynomial time given an advice
string $\sigma$ of length $O(\log n)$ and if every $z \in \{0,1\}^k$ has at most O(n) preimages
in $B \times \{0,1\}^{\log n}$. With such an E, the string $z = E(x, 0^{\log n})$ distinguishes x
from strings that do not map into z and, using the general result of Buhrman,
Fortnow, and Laplante [BFL01], with an additional $2\log n + O(1)$ bits we can
distinguish x from the at most O(n) other strings that map into z. With such an
E, we obtain for every $x \in B^{=n}$ the desired
$CD^{poly,B^{=n}}(x) \le |z| + |\sigma| + 2\log n + O(1) = \log(|B^{=n}|) + O(\log n)$.

Now it remains to build the function E. An elementary use of the probabilistic
method shows that if we take $E : \{0,1\}^n \times \{0,1\}^{\log n} \to \{0,1\}^k$ at random, with
high probability, every $z \in \{0,1\}^k$ has at most 7n preimages. The problem is that
to compute a random E in polynomial time we would need its table, and the table
of such a function has size poly(N), where $N = 2^n$. This is double exponentially
larger than $O(\log n)$, which has to be the size of $\sigma$ from our discussion above.

To reduce the size of the advice information (that makes E computable in
polynomial time) from poly(N) to $O(\log n)$, we derandomize the probabilistic
construction in two steps.

In the first step we observe that counting (the number of preimages
of z) can be done with sufficient accuracy by circuits of size poly(N)
and constant depth using the result of Ajtai [Ajt93]. We infer that there
exists a circuit G of size poly(N) and constant depth such that
$\{E \mid$ every z has $\le 7n$ preimages in $B \times \{0,1\}^{\log n}\} \subseteq \{E \mid G(E) = 1\} \subseteq \{E \mid$
every z has $\le 8n$ preimages in $B \times \{0,1\}^{\log n}\}$. Now we can utilize the
Nisan-Wigderson [NW94] pseudo-random generator NW-gen against constant-depth
circuits and we obtain that, for most seeds s (which we call good seeds for
NW-gen), NW-gen(s) is the table of a function E where each element $z \in \{0,1\}^k$
has at most 8n preimages in $B \times \{0,1\}^{\log n}$. This method is inspired by the work
of Musatov [Mus10], and it has also been used in [Zim11]. The seed s has size
polylog(N) = poly(n), which is not short enough.

In the second step we use the Impagliazzo-Wigderson pseudo-random
generator [IW97] as generalized by Klivans and van Melkebeek [KvM02]. We observe
that checking that a seed s is good for NW-gen can be done in PSPACE, and we
use the hardness assumption to infer the existence of a pseudo-random generator
H such that for most seeds $\sigma$ of H (which we call good seeds for H), $H(\sigma)$ is a
good seed for NW-gen. We have $|\sigma| = \log |s| = O(\log n)$ as desired. Finally, we
take our function E to be the function whose table is NW-gen($H(\sigma)$), for some
good seed $\sigma$ for H. It follows that, given $\sigma$, E is computable in polynomial time
and that every $z \in \{0,1\}^k$ has at most 8n preimages in $B^{=n} \times \{0,1\}^{\log n}$.

The idea of the 2-step derandomization has been used by Antunes and
Fortnow [AF09] and later by Antunes, Fortnow, Pinto and Souza [AFPS07].

2 Preliminaries 

2.1 Notation and basic facts on Kolmogorov complexity 

We work over the binary alphabet {0, 1}. A string is an element of $\{0,1\}^*$. If
x is a string, |x| denotes its length; if B is a finite set, |B| denotes its size. If
$B \subseteq \{0,1\}^*$, then $B^{=n} = \{x \in B \mid |x| = n\}$.

The Kolmogorov complexity of a string x is the length of the shortest program
that prints x. The t-time bounded Kolmogorov complexity of a string x is the
length of the shortest program that prints x in at most t(|x|) steps. The t-time
bounded distinguishing Kolmogorov complexity of a string x is the length of the
shortest program that accepts x and only x and runs in at most t(|x|) steps. The
formal definitions are as follows.

We fix a universal Turing machine U, which is able to simulate any other
Turing machine with only a constant additive term overhead in the program
length. The Kolmogorov complexity of the string x conditioned by string y,
denoted $C(x \mid y)$, is the length of the shortest string p (called a program) such
that U(p, y) = x. In case y is the empty string, we write C(x).

For the time-bounded versions of Kolmogorov complexity, we fix a universal
machine U that, in addition to the above property, can also simulate any Turing
machine M in time $t_M(|x|) \log t_M(|x|)$, where $t_M(|x|)$ is the running time of M
on input x. For a time bound $t(\cdot)$, the t-bounded Kolmogorov complexity of x
conditioned by y, denoted $C^t(x \mid y)$, is the length of the shortest program p such
that U(p, y) = x and U(p, y) halts in at most t(|x| + |y|) steps.

The t-time bounded distinguishing complexity of x conditioned by y, denoted
$CD^t(x \mid y)$, is the length of the shortest program p such that

(1) U(p, x, y) accepts,

(2) U(p, v, y) rejects for all $v \ne x$,

(3) U(p, v, y) halts in at most t(|v| + |y|) steps for all v and y.

In case y is the empty string $\lambda$, we write $CD^t(x)$ in place of $CD^t(x \mid \lambda)$. If U
is an oracle machine, we define in a similar way $CD^{t,A}(x \mid y)$ and $CD^{t,A}(x)$,
by allowing U to query the oracle A.

For defining t-time bounded nondeterministic distinguishing Kolmogorov
complexity, we fix a nondeterministic universal machine U, and we define
$CND^t(x \mid y)$ in a similar way.

Strings $x_1, x_2, \ldots, x_k$ can be encoded in a self-delimiting way (i.e., an
encoding from which each string can be retrieved) using
$|x_1| + |x_2| + \ldots + |x_k| + 2\log|x_2| + \ldots + 2\log|x_k| + O(k)$ bits. For example,
$x_1$ and $x_2$ can be encoded as $\overline{bin(|x_2|)}\,01\,x_1 x_2$, where bin(n) is the
binary encoding of the natural number n and, for a string $u = u_1 \ldots u_m$,
$\overline{u}$ is the string $u_1 u_1 \ldots u_m u_m$ (i.e., the string u with its bits
doubled).
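
The two-string encoding above can be sketched in Python as follows. This is a
minimal illustration; the function names are chosen here and are not from the
paper. The decoder reads doubled bits until it meets the pair 01, which marks
the end of the length field, and then splits the remainder using $|x_2|$.

```python
def double_bits(u: str) -> str:
    """Return u with every bit written twice."""
    return "".join(b + b for b in u)

def encode_pair(x1: str, x2: str) -> str:
    """Encode (x1, x2) as doubled bin(|x2|), then the delimiter 01, then x1 x2."""
    return double_bits(format(len(x2), "b")) + "01" + x1 + x2

def decode_pair(code: str) -> tuple[str, str]:
    """Invert encode_pair."""
    i, length_bits = 0, []
    # Doubled bits come in equal pairs; the first unequal pair is the delimiter 01.
    while code[i] == code[i + 1]:
        length_bits.append(code[i])
        i += 2
    i += 2                                   # skip the delimiter
    len2 = int("".join(length_bits), 2)
    payload = code[i:]
    split = len(payload) - len2              # x2 is the last len2 bits
    return payload[:split], payload[split:]

print(encode_pair("101", "0011"))
print(decode_pair(encode_pair("101", "0011")))
```

The overhead beyond $|x_1| + |x_2|$ is twice the length of bin($|x_2|$) plus two
delimiter bits, matching the $2\log|x_2| + O(1)$ term in the bound.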

2.2 Distinguishing complexity for strings in an arbitrary set 

As mentioned, Buhrman, Fortnow and Laplante [BFL01] have shown that for
any set B and for every $x \in B^{=n}$ it holds that
$CD^{poly,B^{=n}}(x) \le 2\log(|B^{=n}|) + O(\log n)$. We need the following more
explicit statement of their result.

Theorem 1. There exists a polynomial-time algorithm A such that for every
set $B \subseteq \{0,1\}^*$, every n, every $x \in B^{=n}$, there exists a string $p_x$ of length
$|p_x| \le 2\log(|B^{=n}|) + O(\log n)$ such that

• $A(p_x, x)$ = accept,

• $A(p_x, y)$ = reject, for every $y \in B^{=n} - \{x\}$.

2.3 Approximate counting via polynomial-size constant-depth 
circuits 

We will need to do counting with constant-depth polynomial-size circuits.
Ajtai [Ajt93] has shown that this can be done with sufficient precision.

Theorem 2. (Ajtai's approximate counting with polynomial-size
constant-depth circuits.) There exists a uniform family of circuits $\{G_n\}_{n \in N}$,
of polynomial size and constant depth, such that for every n, for every
$x \in \{0,1\}^n$, for every $a \in \{0, \ldots, n-1\}$, and for every $\epsilon > 0$,

• If the number of 1's in x is $\le (1 - \epsilon)a$, then $G_n(x, a, 1/\epsilon) = 1$,

• If the number of 1's in x is $\ge (1 + \epsilon)a$, then $G_n(x, a, 1/\epsilon) = 0$.

We do not need the full strength (namely, the uniformity of $G_n$) of this theorem;
the required level of accuracy (just $\epsilon > 0$) can be achieved by non-uniform
polynomial-size circuits of depth d = 3 (with a much easier proof, see [Vio10]).

2.4 Pseudo-random generator fooling bounded-size constant-depth 
circuits 

The first step in the derandomization argument in the proof of Theorem 5 is 
done using the Nisan-Wigderson pseudo-random generator that "fools" constant- 
depth circuits [NW94]. 

Theorem 3 (Nisan-Wigderson pseudo-random generator). For every
constant d there exists a constant $\alpha > 0$ with the following property. There
exists a function NW-gen : $\{0,1\}^{O(\log^{2d+6} n)} \to \{0,1\}^n$ such that for any
circuit G of size $2^{n^{\alpha}}$ and depth d,

$|Prob_{s \in \{0,1\}^{O(\log^{2d+6} n)}}[G(\text{NW-gen}(s)) = 1] - Prob_{z \in \{0,1\}^n}[G(z) = 1]| < 1/100$.

Moreover, there is a procedure that on inputs (n, i, s) produces the i-th bit of
NW-gen(s) in time poly(log n).
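
The generator of [NW94] against constant-depth circuits is built from a
combinatorial design (a family of seed positions with small pairwise
intersections) and the parity function. The following Python sketch shows only
that structure, with toy parameters q and d chosen here for illustration; the
names `design` and `nw_gen` are hypothetical, and the parameters are not those
required by Theorem 3.

```python
from itertools import product

def design(q: int, d: int) -> list[list[int]]:
    """Sets S_p = {(a, p(a)) : a in Z_q}, one per polynomial p of degree < d.

    q is assumed prime. Points (a, b) are indexed as a * q + b, so every set has
    q elements inside a universe of size q * q, and two distinct sets share at
    most d - 1 elements because two distinct polynomials agree on at most
    d - 1 points.
    """
    sets = []
    for coeffs in product(range(q), repeat=d):
        S = [a * q + sum(c * pow(a, j, q) for j, c in enumerate(coeffs)) % q
             for a in range(q)]
        sets.append(S)
    return sets

def nw_gen(seed: list[int], q: int, d: int) -> list[int]:
    """Output bit i is the parity of the seed bits at the positions in the i-th set."""
    assert len(seed) == q * q
    return [sum(seed[pos] for pos in S) % 2 for S in design(q, d)]

# Toy run: a 25-bit seed is stretched to 125 output bits.
q, d = 5, 3
seed = [(7 * i + 3) % 2 for i in range(q * q)]
out = nw_gen(seed, q, d)
print(len(seed), len(out))
```

Each output bit can be computed separately from the index of its set, which is
the kind of local computability that the last sentence of Theorem 3 refers to.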



2.5 Hardness assumptions and pseudo-random generators 

The second step in the derandomization argument in the proof of Theorem 5
uses pseudo-random generators based on the assumption that hard functions
exist in $E = \bigcup_{c} DTIME[2^{cn}]$. Such pseudo-random generators were constructed by
Nisan and Wigderson [NW94]. Impagliazzo and Wigderson [IW97] strengthen
the results in [NW94] by weakening the hardness assumptions. Klivans and van
Melkebeek [KvM02] show that the Impagliazzo-Wigderson results hold in
relativized worlds. We use in this paper some instantiations of a general result
in [KvM02].

We need the following definition. For a function $f : \{0,1\}^* \to \{0,1\}$ and
an oracle A, the circuit complexity $C_f^A(n)$ of f at length n relative to A is the
smallest integer t such that there exists an A oracle circuit of size t that computes
f on inputs of length n.

We use the following hardness assumptions. 

Assumption H1:

There exists $f \in E$ such that for some $\epsilon > 0$ and for some PSPACE complete
set A, $C_f^A(n) > 2^{\epsilon n}$.

Assumption H2:

There exists $f \in E$ such that for some $\epsilon > 0$ and for some $\Sigma_3^p$ complete set
A, $C_f^A(n) > 2^{\epsilon n}$.

If H1 holds, then for some oracle A that is PSPACE complete, for every k,
there exists $H : \{0,1\}^{c \log n} \to \{0,1\}^n$ computable in time poly(n) such that for
every oracle circuit C of size $n^k$,

$|Prob_{\sigma \in \{0,1\}^{c \log n}}[C^A(H(\sigma)) = 1] - Prob_{s \in \{0,1\}^n}[C^A(s) = 1]| < o(1)$.

If H2 holds, then for some oracle A that is $\Sigma_3^p$ complete, for every k, there exists
$H : \{0,1\}^{c \log n} \to \{0,1\}^n$ computable in time poly(n) such that for every oracle
circuit C of size $n^k$,

$|Prob_{\sigma \in \{0,1\}^{c \log n}}[C^A(H(\sigma)) = 1] - Prob_{s \in \{0,1\}^n}[C^A(s) = 1]| < o(1)$.

3 Main Result

Theorem 4. Assuming H1, the following holds: For every set B in PSPACE,
there exists a polynomial p such that for every length n, and for every string
$x \in B^{=n}$, there exists a string z with the following properties:

(1) $|z| = \lceil \log(|B^{=n}|) \rceil$,

(2) $C^p(z \mid x) = O(\log n)$,

(3) $CD^{p,B^{=n}}(x \mid z) = O(\log n)$.

Before proving the theorem, we note that (1) and (3) immediately imply the 
following theorem, which is our main result. 



Theorem 5. Assuming H1, the following holds: For every set B in PSPACE
there exists a polynomial p, such that for every length n, and for every string
$x \in B^{=n}$,

$CD^{p,B^{=n}}(x) \le \log(|B^{=n}|) + O(\log n)$.

Proof (of Theorem 4). Let us fix a set B in PSPACE and let us focus on $B^{=n}$,
the set of strings of length n in B. Let $k = \lceil \log |B^{=n}| \rceil$ and let $K = 2^k$. Also, let
$N = 2^n$.

Definition 1. Let $E : \{0,1\}^n \times \{0,1\}^{\log n} \to \{0,1\}^k$. We say that E is
$\Delta$-balanced if for every $z \in \{0,1\}^k$,

$|E^{-1}(z) \cap B^{=n} \times \{0,1\}^{\log n}| \le \Delta \cdot \frac{|B^{=n}| \cdot n}{K}$.

The plan for the proof is as follows. Suppose that we have a function
$E : \{0,1\}^n \times \{0,1\}^{\log n} \to \{0,1\}^k$ that is $\Delta$-balanced, for some constant $\Delta$.

Furthermore assume that E can be "described" by a string $\sigma$, in the sense
that given $\sigma$ as an advice string, E is computable in time polynomial in n.

Fix x in $B^{=n}$ and let $z = E(x, 0^{\log n})$. Clearly, the string z satisfies
requirement (1). It remains to show (2) and (3).

Consider the set

$U = \{u \in B^{=n} \mid \exists v \in \{0,1\}^{\log n}, E(u, v) = z\}$.

Since E is $\Delta$-balanced, the size of U is bounded by
$\Delta \cdot |B^{=n}| \cdot n / K \le \Delta n$ (recall that $K \ge |B^{=n}|$).
Now observe that for some polynomial p,

$CD^{p,B^{=n}}(x \mid z) \le |\sigma| + 2\log(\Delta n) + O(\log n)$ + self-delimitation overhead.

Indeed, the following is a polynomial-time algorithm using oracle $B^{=n}$ that
distinguishes x (it uses an algorithm A, promised by Theorem 1, that distinguishes
x from the other strings in U, using a string $p_x$ of length
$2\log(|U^{=n}|) + O(\log n) \le 2\log(\Delta n) + O(\log n)$).

Input: y; the strings z, $\sigma$, $p_x$, defined above, are also given.
If $y \notin B^{=n}$, then reject.
If for all $v \in \{0,1\}^{\log n}$, $E(y, v) \ne z$, then reject.
If $A(p_x, y)$ = reject, then reject.
Else accept.
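
The three tests of the distinguisher can be sketched in Python on a toy
instance. Everything concrete below is an assumption made for illustration:
the map `E` is an arbitrary toy function rather than a balanced one obtained
from an advice string, and `make_p_x` together with `A` is a simple stand-in
based on residues modulo a small prime; it is not claimed to be the algorithm
of Theorem 1.

```python
from itertools import product

def is_prime(q: int) -> bool:
    return q >= 2 and all(q % d for d in range(2, int(q ** 0.5) + 1))

def make_p_x(x: str, others: set[str]) -> tuple[int, int]:
    """Stand-in for p_x: a prime q and residue r that single out x among `others`."""
    xv, q = int(x, 2), 2
    while not (is_prime(q) and all((int(u, 2) - xv) % q for u in others)):
        q += 1
    return q, xv % q

def A(p_x: tuple[int, int], y: str) -> bool:
    q, r = p_x
    return int(y, 2) % q == r

def E(u: str, v: str, k: int) -> str:
    """Hypothetical toy map {0,1}^n x {0,1}^(log n) -> {0,1}^k."""
    return format((5 * int(u, 2) + int(v, 2)) % 2 ** k, f"0{k}b")

def distinguisher(y: str, z: str, p_x, B: set[str], log_n: int, k: int) -> bool:
    if y not in B:                                      # the single oracle query
        return False
    if all(E(y, "".join(v), k) != z for v in product("01", repeat=log_n)):
        return False                                    # y does not map into z
    return A(p_x, y)                                    # separate x inside U

n, log_n, k = 4, 2, 3
B = {"0001", "0010", "0111", "1000", "1011", "1101"}    # toy set, |B| = 6
x = "0111"
z = E(x, "0" * log_n, k)
U = {u for u in B if any(E(u, "".join(v), k) == z for v in product("01", repeat=log_n))}
p_x = make_p_x(x, U - {x})
accepted = ["".join(b) for b in product("01", repeat=n)
            if distinguisher("".join(b), z, p_x, B, log_n, k)]
print(accepted)
```

The oracle is consulted only in the first test, which is the property used
later for sets in NP.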



Clearly, the algorithm accepts input y iff y = x. Also, since $z = E(x, 0^{\log n})$,
$C^p(z \mid x) \le |\sigma| + O(1)$. For a further application (Theorem 14), note that the
above algorithm queries the oracle $B^{=n}$ a single time.

Therefore, if we manage to achieve $|\sigma| = O(\log n)$, we obtain that
$CD^{p,B^{=n}}(x \mid z) \le O(\log n)$ and $C^p(z \mid x) \le O(\log n)$.

Thus our goal is to produce a function $E : \{0,1\}^n \times \{0,1\}^{\log n} \to \{0,1\}^k$ that,
using an advice string $\sigma$ of length $O(\log n)$, is computable in polynomial time
and is $\Delta$-balanced for some constant $\Delta$. Let us call this goal (*).



We implement the ideas presented in Section 1.1, and the reader may find it
convenient to review that discussion.



Claim 6. With probability at least 0.99, a random function
$E : \{0,1\}^n \times \{0,1\}^{\log n} \to \{0,1\}^k$ is 7-balanced.

Proof. For fixed $x \in B$, $y \in \{0,1\}^{\log n}$, $z \in \{0,1\}^k$, if we take a random function
$E : \{0,1\}^n \times \{0,1\}^{\log n} \to \{0,1\}^k$, we have that Prob[E(x, y) = z] = 1/K.
Thus the expected number of preimages of z in the rectangle $B \times \{0,1\}^{\log n}$ is
$\mu = (1/K) \cdot |B| \cdot n$. By the Chernoff bound (see footnote 1),

Prob[number of preimages of z in $B \times \{0,1\}^{\log n} > 7\mu$] $< e^{-(6 \ln 2)\mu}$.

Therefore, the probability of the event "there is some $z \in \{0,1\}^k$ such that the
number of z-cells in $B \times \{0,1\}^{\log n}$ is $> 7\mu$" is at most $K \cdot e^{-(6 \ln 2)\mu} < 0.01$. □
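
Claim 6 can be checked empirically on a small instance. The following Python
sketch is an illustration under assumptions chosen here (n = 8, a set of 100
strings, a fixed random seed); it samples a random E cell by cell and compares
the largest number of preimages with the expected value $\mu$.

```python
import random
from collections import Counter
from math import ceil, log2

def max_load_ratio(n: int, size_B: int, rng: random.Random) -> float:
    """Sample a random E on B x {0,1}^(log n) and return (largest preimage count) / mu.

    n is assumed to be a power of two, so {0,1}^(log n) has exactly n elements.
    Only the size of B matters for the distribution of the counts.
    """
    k = ceil(log2(size_B))
    K = 2 ** k
    mu = size_B * n / K
    # Each of the size_B * n cells receives an independent uniform value z in [0, K).
    loads = Counter(rng.randrange(K) for _ in range(size_B) for _ in range(n))
    return max(loads.values()) / mu

ratio = max_load_ratio(n=8, size_B=100, rng=random.Random(0))
print(ratio, ratio <= 7)    # a ratio of at most 7 means this sample is 7-balanced
```

In such experiments the ratio typically stays far below 7, which is consistent
with the slack in the Chernoff estimate.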

Claim 7. There exists a circuit G of size poly(N) and constant depth such that
for any function $E : \{0,1\}^n \times \{0,1\}^{\log n} \to \{0,1\}^k$ (whose table is given to G as
the input),

(a) If G(E) = 1, then E is 8-balanced,

(b) If E is 7-balanced, then G(E) = 1.

Proof. By Theorem 2, there is a constant-depth, poly(N) size circuit that counts
in an approximate sense the occurrences of a string z in $\{0,1\}^k$ in the rectangle
$B \times \{0,1\}^{\log n}$. If we make a copy of this circuit for each $z \in \{0,1\}^k$ and link all
these copies to an AND gate we obtain the desired circuit G.

More precisely, let $x_z$ be the binary string of length $|B| \cdot n$, whose bits are
indexed as (u, v) for $u \in B$, $v \in \{0,1\}^{\log n}$, and whose (u, v)-bit is 1 iff
E(u, v) = z. By Theorem 2, there is a constant-depth, poly(N) size circuit G' that
behaves as follows:

• $G'(x_z) = 1$ if the number of 1's in $x_z$ is $\le 7 \cdot \frac{|B| \cdot n}{K}$,

• $G'(x_z) = 0$ if the number of 1's in $x_z$ is $\ge 8 \cdot \frac{|B| \cdot n}{K}$.

If the number of 1's is between the two bounds, the circuit G' outputs either
0 or 1.

The circuit G, on input a table of E, first builds the string $x_z$ for each
$z \in \{0,1\}^k$, then has a copy of G' for each z, with the z-th copy running on $x_z$,
and then connects the outputs of all the copies to an AND gate, which is the
output gate of G.

□
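
The input/output behaviour required of G in Claim 7 can be mimicked in
software. The Python sketch below is only a functional illustration: it uses
exact counting in place of Ajtai's constant-depth approximate counter, and the
names `approx_threshold` and `G` as well as the toy tables are assumptions
made here.

```python
def approx_threshold(bits: list[int], low: float, high: float) -> int:
    """Stand-in for G': outputs 1 when the count is <= low and 0 when it is >= high.

    Between the two bounds any answer is allowed; this stand-in answers 1 there.
    """
    return 1 if sum(bits) < high else 0

def G(E_table: dict, B: list[int], n: int, k: int) -> int:
    """AND over all z of G'(x_z), where x_z marks the cells (u, v) with E(u, v) = z."""
    K = 2 ** k
    mu = len(B) * n / K
    cells = [(u, v) for u in B for v in range(n)]
    for z in range(K):
        x_z = [1 if E_table[(u, v)] == z else 0 for (u, v) in cells]
        if approx_threshold(x_z, 7 * mu, 8 * mu) == 0:
            return 0
    return 1

# Toy instance: n = 4, B = all 16 strings of length 4 (as integers), k = 4.
n, k = 4, 4
B = list(range(16))
balanced = {(u, v): u for u in B for v in range(n)}     # every z has exactly mu preimages
constant = {(u, v): 0 for u in B for v in range(n)}     # z = 0 has all 64 preimages
print(G(balanced, B, n, k), G(constant, B, n, k))
```

Any table accepted by this G has fewer than $8\mu$ preimages for every z, and
every 7-balanced table is accepted, which are exactly properties (a) and (b).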

1 We use the following version of the Chernoff bound. If X is a sum of
independent Bernoulli random variables, and the expected value $E[X] = \mu$, then
$Prob[X \ge (1 + \Delta)\mu] \le e^{-\Delta(\ln(\Delta/3))\mu}$. The standard Chernoff inequality
$Prob[X \ge (1 + \Delta)\mu] \le \left(\frac{e^{\Delta}}{(1+\Delta)^{(1+\Delta)}}\right)^{\mu}$ is presented in many textbooks. It can be
checked easily that $\frac{e^{\Delta}}{(1+\Delta)^{(1+\Delta)}} \le e^{-\Delta \ln(\Delta/3)}$, which implies the above form.



Claim 8. If we pick at random a function $E : \{0,1\}^n \times \{0,1\}^{\log n} \to \{0,1\}^k$,
with probability at least 0.99, G(E) = 1.

Proof. This follows from Claim 6 and from Claim 7 (b). □

Let $\tilde{N} = N \cdot n \cdot k$. Let d be the depth of the circuit G. We denote
$\tilde{n} = \log^{2d+6} \tilde{N}$. Note that $\tilde{n}$ = poly(n). We consider the Nisan-Wigderson
pseudo-random generator for depth d given by Theorem 3. Thus,

NW-gen : $\{0,1\}^{\tilde{n}} \to \{0,1\}^{\tilde{N}}$.

For any string s of length $\tilde{n}$, we view NW-gen(s) as the table of a function
$E : \{0,1\}^n \times \{0,1\}^{\log n} \to \{0,1\}^k$.

Claim 9. If we pick at random $s \in \{0,1\}^{\tilde{n}}$, with probability at least 0.9, it
holds that NW-gen(s) is 8-balanced.

Proof. Since G is a circuit of constant depth and polynomial size, by Theorem 3,
the probability of the event "G(NW-gen(s)) = 1" is within 0.01 of the
probability of the event "G(E) = 1," and the second probability is at least 0.99 by
Claim 8. Thus the first probability is greater than 0.9. Taking into account that
if G(NW-gen(s)) = 1 then NW-gen(s) is 8-balanced, the conclusion follows. □

Claim 10. Let $T = \{s \in \{0,1\}^{\tilde{n}} \mid$ NW-gen(s) is 8-balanced$\}$. Then T is in
PSPACE.

Proof. We need to count for every $z \in \{0,1\}^k$, the number of z-cells in the
rectangle $B^{=n} \times \{0,1\}^{\log n}$ of the table of NW-gen(s). Since B is in PSPACE and
since each entry in the table of NW-gen(s) can be computed in time polynomial
in n, it follows that the above operation can be done in PSPACE. □

Claim 11. Assume H1. There exists a constant c and a function
$H : \{0,1\}^{c \log n} \to \{0,1\}^{\tilde{n}}$, computable in time poly($\tilde{n}$) = poly(n), such that if $\sigma$ is
a string chosen at random in $\{0,1\}^{c \log n}$, with probability at least 0.8, it holds
that NW-gen($H(\sigma)$) is 8-balanced.

Proof. Under assumption H1, there exists a pseudo-random generator
$H : \{0,1\}^{c \log n} \to \{0,1\}^{\tilde{n}}$ such that for any set A in PSPACE,

$|Prob_{\sigma \in \{0,1\}^{c \log n}}[H(\sigma) \in A] - Prob_{s \in \{0,1\}^{\tilde{n}}}[s \in A]| < 0.1$.

Since the set T is in PSPACE, $Prob_{\sigma \in \{0,1\}^{c \log n}}$[NW-gen($H(\sigma)$) is 8-balanced] is
within 0.1 of $Prob_{s \in \{0,1\}^{\tilde{n}}}$[NW-gen(s) is 8-balanced]. Since the latter
probability is at least 0.9, the conclusion follows. □

We can now finish the proof of Theorem 4.

Fix $\sigma \in \{0,1\}^{c \log n}$ such that NW-gen($H(\sigma)$) is 8-balanced. Note that
$|\sigma| = O(\log n)$. Given $\sigma$, each entry in the table of NW-gen($H(\sigma)$) can be computed
in time poly(n). Thus the function $E : \{0,1\}^n \times \{0,1\}^{\log n} \to \{0,1\}^k$, whose
table is NW-gen($H(\sigma)$), satisfies the goal (*). □



4 Variations around the main result 



We analyze here the polynomial-time bounded distinguishing Kolmogorov
complexity of strings in a set B that is in P or in NP. We start with the case
$B \in P$. The following is the analog of Theorem 4 and its main point is that
assumption H1 can be replaced by the presumably weaker assumption H2.

Theorem 12. Assuming H2, the following holds: For every set B in P, there
exists a polynomial p such that, for every length n, and for every string $x \in B^{=n}$,
there exists a string z with the following properties:

(1) $|z| = \lceil \log(|B^{=n}|) \rceil$,

(2) $C^p(z \mid x) = O(\log n)$,

(3) $CD^p(x \mid z) = O(\log n)$.

Proof. We follow the proof of Theorem 4. First note that since $B \in P$, the
universal machine does not need oracle access to B. We still need to justify that
assumption H1 can be replaced by the weaker assumption H2.

Assumption H1 was used in Claim 11. The point was that the set T =
{s | NW-gen(s) is 8-balanced} is in PSPACE and H1 was used to infer the
existence of a pseudo-random generator H that fools T. If $B \in P$, we can check
that NW-gen(s) is sufficiently balanced using less computational power than
PSPACE. Basically we need to check that for all $z \in \{0,1\}^k$,

$|\text{NW-gen}(s)^{-1}(z) \cap B^{=n} \times \{0,1\}^{\log n}| \le \Delta \cdot \frac{|B^{=n}| \cdot n}{K}$,

for some constant $\Delta$. Using Sipser's method from [Sip83], there is a predicate
R such that

• R(s, z) = 1 implies $|\text{NW-gen}(s)^{-1}(z) \cap B^{=n} \times \{0,1\}^{\log n}| \le 16 \cdot n$,

• R(s, z) = 0 implies $|\text{NW-gen}(s)^{-1}(z) \cap B^{=n} \times \{0,1\}^{\log n}| > 64 \cdot n$.

Thus there is a set $T' \subseteq \{0,1\}^{\tilde{n}}$ such that for all $s \in T'$, NW-gen(s) is
64-balanced and T' contains all s such that NW-gen(s) is 8-balanced. Note that
the second property implies that $|T'| > 0.99 \cdot 2^{\tilde{n}}$.

Now assumption H2 implies that there exists a pseudo-random generator
$H : \{0,1\}^{c \log(\tilde{n})} \to \{0,1\}^{\tilde{n}}$ that fools T'. In particular it follows that with
probability of $\sigma \in \{0,1\}^{c \log(\tilde{n})}$ at least 0.8, $H(\sigma) \in T'$ and thus NW-gen($H(\sigma)$)
is 64-balanced. The rest of the proof is identical with the proof of Theorem 5.

□

The next result is the analog of Theorem 5 for the case when the set B is in P. 

Theorem 13. Assuming H2, the following holds: For every $B \in P$, there exists
a polynomial p such that for all n, and for all $x \in B^{=n}$,

$CD^{p}(x) \le \log(|B^{=n}|) + O(\log n)$.

Proof. This is an immediate consequence of (1) and (3) in Theorem 12. □



Next we consider the case when the set B is in NP. The main point is that the
assumption H1 can be replaced by H2, and that the distinguishing program does
not need access to the oracle $B^{=n}$ provided it is nondeterministic.

Theorem 14. Assuming H2, the following holds: For every set B in NP, there
exists a polynomial p such that for every length n, and for every string $x \in B^{=n}$,
there exists a string z with the following properties:

(1) $|z| = \lceil \log(|B^{=n}|) \rceil$,

(2) $C^p(z \mid x) = O(\log n)$,

(3) $CD^{p,B^{=n}}(x \mid z) = O(\log n)$,

(4) $CND^p(x \mid z) = O(\log n)$.

Proof. (1), (2) and (3). We only need to show that in the proof of Theorem 4,
in case $B \in NP$, the assumption H1 can be replaced by the weaker assumption
H2. This is done virtually in the same way as in the proof of Theorem 13. The
predicate R also needs this time to check that certain strings are in B and this
involves an additional quantifier, but that quantifier can be merged with the
existing quantifiers and R remains in the same class.

(4). We need to show that, at the price of replacing CD by CND, the use
of the oracle $B^{=n}$ is no longer necessary. Note that the distinguisher procedure
given in the proof of Theorem 4 queries the oracle only once, and if the answer
to that query is NO, then the algorithm rejects immediately. Thus, instead of
making the query, a nondeterministic distinguisher can just guess a witness for
the single query it makes. □

The following is the analog of Theorem 5 in case the set B is in NP. 

Theorem 15. Assuming H2, the following holds:

(a) For every $B \in NP$, there exists a polynomial p, such that for all n, and
for all $x \in B^{=n}$,

$CD^{p,B^{=n}}(x) \le \log(|B^{=n}|) + O(\log n)$.

(b) For every $B \in NP$, there exists a polynomial p, such that for all n, and
for all $x \in B^{=n}$,

$CND^p(x) \le \log(|B^{=n}|) + O(\log n)$.

Proof. Statement (a) follows from (1) and (3) in Theorem 14, and (b) follows
from (1) and (4) in Theorem 14. □